MSDS458 Research Assignment 02 Part 2 - TSNE

More Technical: Throughout the notebook, boxes of this type provide more technical details and extra references about what you are seeing. They contain helpful tips, but you can safely skip them the first time you run through the code.

You are welcome to use the CIFAR-10 data for this exercise. You are welcome to use Python with user-defined functions, Python with TensorFlow, and/or Python with Keras. For example, you can conduct the following experiments on the CIFAR-10 data. The goal is to compare DNN and CNN architectures. In all the experiments, you may hold some parameters constant (for example, a batch size of 100, 20 epochs, the same optimizer, and the same cross-entropy loss function) so that the comparisons are fair.

Experiment 1: DNN with 2 layers (no regularization)

Experiment 2: DNN with 3 layers (no regularization)

Experiment 3: CNN with 2 convolution/max pooling layers (no regularization)

Experiment 4: CNN with 3 convolution/max pooling layers (no regularization)

Experiment 5+: You will conduct several more experiments. (a) Redo all 4 experiments with some regularization technique. (b) Create more experiments on your own by tweaking architectures and/or hyperparameters.

Result 1: Create a table with the accuracy and loss on the train, validation, and test sets, plus the processing time, for ALL the models.
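One way to collect the numbers for such a table is sketched below, using only the standard library. The helper names `timed` and `format_results`, the column names, and the zero-valued row are all hypothetical placeholders, not results from this notebook; in practice each row would be filled from the training history and evaluation calls of one experiment.

```python
import time

def timed(fn, *args, **kwargs):
    """Run fn(*args, **kwargs) and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    return result, time.perf_counter() - start

COLUMNS = ["experiment", "train_acc", "train_loss", "val_acc",
           "val_loss", "test_acc", "test_loss", "seconds"]

def format_results(rows):
    """Render a list of result dicts as a fixed-width plain-text table."""
    lines = ["  ".join(f"{c:>12}" for c in COLUMNS)]
    for row in rows:
        cells = []
        for c in COLUMNS:
            value = row[c]
            # Floats get four decimals; everything else is printed as-is.
            cells.append(f"{value:>12.4f}" if isinstance(value, float)
                         else f"{value:>12}")
        lines.append("  ".join(cells))
    return "\n".join(lines)

# Placeholder row: the zeros are NOT measured values, only a layout demo.
placeholder_row = {c: 0.0 for c in COLUMNS}
placeholder_row["experiment"] = "Experiment 1"
print(format_results([placeholder_row]))
```

Wrapping the training call in `timed` keeps the processing time next to the metrics, so every model contributes one complete row.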

Result 2: For Experiment 3, extract the outputs from 2 filters from each of the 2 max pooling layers and visualize them in a grid as images. See whether the 'lit up' regions correspond to some features in the original images.
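The grid step can be sketched with NumPy alone. The function `feature_map_grid` below is a hypothetical helper, and the random array stands in for real activations; in the notebook the activations would come from the max pooling layers of the Experiment 3 model, for example through a second Keras model whose outputs are those layers' outputs.

```python
import numpy as np

def feature_map_grid(activations, filter_indices):
    """Tile selected filters of a batch of activations into one 2-D image.

    activations: array of shape (n_images, height, width, n_filters)
    filter_indices: filters (channels) to show, one per grid column
    Each row of the grid is one input image, each column one filter.
    """
    n_images, h, w, _ = activations.shape
    grid = np.zeros((n_images * h, len(filter_indices) * w), dtype=np.float32)
    for row in range(n_images):
        for col, f in enumerate(filter_indices):
            fmap = activations[row, :, :, f].astype(np.float32)
            span = fmap.max() - fmap.min()
            # Scale each map to [0, 1] so dim filters remain visible.
            fmap = (fmap - fmap.min()) / span if span > 0 else np.zeros_like(fmap)
            grid[row * h:(row + 1) * h, col * w:(col + 1) * w] = fmap
    return grid

# Stand-in activations: 3 images, 15x15 maps, 32 filters (assumed shapes).
rng = np.random.default_rng(0)
demo_activations = rng.random((3, 15, 15, 32))
demo_grid = feature_map_grid(demo_activations, filter_indices=[0, 1])
print(demo_grid.shape)
```

The resulting 2-D array can be shown with any image viewer, for instance matplotlib's `imshow`, next to the original images for comparison.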

The CIFAR-10 dataset (Canadian Institute For Advanced Research) is a collection of images that are commonly used to train machine learning and computer vision algorithms. It is one of the most widely used datasets for machine learning research. The CIFAR-10 dataset contains 60,000 32x32 color images in 10 different classes. The 10 different classes represent airplanes, cars, birds, cats, deer, dogs, frogs, horses, ships, and trucks. There are 6,000 images of each class.

The CIFAR-10 dataset
https://www.cs.toronto.edu/~kriz/cifar.html

Import packages needed

Verify TensorFlow Version and Keras Version

Mount Google Drive to Colab Environment

Loading cifar10 Dataset

The CIFAR-10 dataset consists of 60000 32x32 colour images in 10 classes, with 6000 images per class. There are 50000 training images and 10000 test images.

The dataset is divided into five training batches and one test batch, each with 10000 images. The test batch contains exactly 1000 randomly-selected images from each class. The training batches contain the remaining images in random order, but some training batches may contain more images from one class than another. Between them, the training batches contain exactly 5000 images from each class.

Preprocess Data For Model Development

The labels are an array of integers, ranging from 0 to 9. These correspond to the class of object the image represents:

Label Class
0 airplane
1 automobile
2 bird
3 cat
4 deer
5 dog
6 frog
7 horse
8 ship
9 truck
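The table above maps directly onto a Python list indexed by label. The names `CLASS_NAMES` and `label_to_name` are illustrative, not taken from the notebook's code.

```python
# Index in the list == integer label in the dataset.
CLASS_NAMES = ["airplane", "automobile", "bird", "cat", "deer",
               "dog", "frog", "horse", "ship", "truck"]

def label_to_name(label):
    """Return the class name for an integer label in the range 0..9."""
    return CLASS_NAMES[int(label)]

print(label_to_name(3))
```

If the labels arrive as a column array (one label per row), flattening them first makes this kind of lookup straightforward.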

Create Validation Data Set
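A minimal sketch of one way to carve a validation set out of the training data, assuming a random hold-out. The function name `split_validation`, the seed, and the hold-out size are assumptions; the notebook's actual split may differ.

```python
import numpy as np

def split_validation(x, y, n_val, seed=0):
    """Shuffle once, then hold out the first n_val examples for validation."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(len(x))
    val_idx, train_idx = order[:n_val], order[n_val:]
    return (x[train_idx], y[train_idx]), (x[val_idx], y[val_idx])

# Stand-in data with CIFAR-10 image shape; only 100 examples to keep it small.
x_demo = np.zeros((100, 32, 32, 3), dtype=np.uint8)
y_demo = np.arange(100)
(x_tr, y_tr), (x_va, y_va) = split_validation(x_demo, y_demo, n_val=20)
print(x_tr.shape, x_va.shape)
```

Shuffling images and labels with the same index array keeps each image paired with its label, and the fixed seed makes the split repeatable across experiments.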

Confirm Datasets {Train, Validation, Test}

Rescale Examples {Train, Validation, Test}

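The images are 32x32x3 NumPy arrays (height, width, RGB channels), with pixel values ranging from 0 to 255

  1. Each element in each example is the intensity of one color channel of one pixel
  2. Values range from 0 to 255
  3. 0 = no intensity (all three channels at 0 is black)
  4. 255 = full intensity (all three channels at 255 is white)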
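Rescaling to the range 0 to 1 amounts to a cast and a division. This is a sketch with a hypothetical helper name `rescale`; it would be applied identically to the train, validation, and test arrays.

```python
import numpy as np

def rescale(images):
    """Convert uint8 pixel values in [0, 255] to float32 values in [0, 1]."""
    return images.astype("float32") / 255.0

# Two pixel values at the extremes of the range.
demo_pixels = np.array([0, 255], dtype=np.uint8)
scaled = rescale(demo_pixels)
print(scaled)
```

Casting before dividing matters: the inputs are 8-bit integers, and the network expects floating-point values.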

Create the Model

Model and Performance Functions

Build CNN Model

We use the Sequential class defined in Keras to create our model. The first 9 layers (Conv2D, MaxPooling, and Dropout) handle feature learning. The last 3 layers handle classification.
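A sketch of a 12-layer model matching that description, assuming the 9 feature-learning layers are three Conv2D/MaxPooling2D/Dropout blocks and the 3 classification layers are Flatten plus two Dense layers. The filter counts, dropout rate, dense width, and optimizer are assumptions chosen for illustration, not the notebook's settings.

```python
from tensorflow import keras
from tensorflow.keras import layers

model = keras.Sequential([
    keras.Input(shape=(32, 32, 3)),           # one CIFAR-10 image
    # Feature learning: three Conv2D / MaxPooling2D / Dropout blocks.
    layers.Conv2D(32, (3, 3), activation="relu"),
    layers.MaxPooling2D((2, 2)),
    layers.Dropout(0.3),
    layers.Conv2D(64, (3, 3), activation="relu"),
    layers.MaxPooling2D((2, 2)),
    layers.Dropout(0.3),
    layers.Conv2D(128, (3, 3), activation="relu"),
    layers.MaxPooling2D((2, 2)),
    layers.Dropout(0.3),
    # Classification: flatten, one hidden dense layer, 10-way softmax.
    layers.Flatten(),
    layers.Dense(128, activation="relu"),
    layers.Dense(10, activation="softmax"),
])

# Integer labels 0..9 pair with the sparse form of cross-entropy.
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.summary()
```

Calling `model.summary()` is a quick way to confirm the layer count and the shape each block produces before training.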

Confusion matrices

We use sklearn.metrics to compute the confusion matrix, then visualize it and see what it tells us.
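To make the layout of a confusion matrix concrete, here is a NumPy stand-in (not the sklearn call itself) that follows the same convention as `sklearn.metrics.confusion_matrix`: rows are true labels, columns are predicted labels. The function name and the toy labels are illustrative.

```python
import numpy as np

def confusion_matrix_np(y_true, y_pred, n_classes):
    """Count how often each true class (row) was predicted as each class (column)."""
    cm = np.zeros((n_classes, n_classes), dtype=np.int64)
    # np.add.at accumulates correctly even when index pairs repeat.
    np.add.at(cm, (np.asarray(y_true), np.asarray(y_pred)), 1)
    return cm

# Toy example with 3 classes: one class-0 image is mistaken for class 1.
toy_true = [0, 0, 1, 2]
toy_pred = [0, 1, 1, 2]
toy_cm = confusion_matrix_np(toy_true, toy_pred, n_classes=3)
print(toy_cm)

# Per-class recall: correct predictions divided by the row total.
recall = np.diag(toy_cm) / toy_cm.sum(axis=1)
print(recall)
```

Off-diagonal cells show which pairs of classes the model confuses, which is often more informative than overall accuracy.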

sklearn.manifold.TSNE

https://scikit-learn.org/stable/modules/generated/sklearn.manifold.TSNE.html
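A minimal usage sketch of `sklearn.manifold.TSNE` on stand-in data. The random `features` array, its size, and the perplexity value are assumptions; in the notebook the input would presumably be one feature vector per image, such as the activations of a hidden layer.

```python
import numpy as np
from sklearn.manifold import TSNE

# Stand-in for one feature vector per image: 60 samples, 16 dimensions.
rng = np.random.default_rng(0)
features = rng.normal(size=(60, 16)).astype(np.float32)

# perplexity must be smaller than the number of samples.
tsne = TSNE(n_components=2, perplexity=10, init="pca", random_state=0)
embedding = tsne.fit_transform(features)
print(embedding.shape)
```

Each row of `embedding` is a 2-D point that can be scattered and colored by class label to see whether a model's features separate the classes.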

Experiment 1: 2 Dense layers

Experiment 2: 3 Dense layers

Experiment 3: CNN with 2 convolution/max pooling layers (no regularization)

Experiment 4: 3 Convolution layers

Experiment 5

Experiment 6

Experiment 7

Experiment 8

Experiment 9

Experiment 10

Experiment 11

Experiment 12

Experiment 13

Experiment 14

Experiment 15

Experiment 16

Experiment 17

Experiment 18 : DNN with 5 dense

Experiment 18 A: DNN with 5 dense

Experiment 20

Experiment 21